On Indexing Handwritten Text

Author

  • Ibrahim Kamel
Abstract

This paper deals with an emerging multimedia data type: handwritten cursive text. It presents two indexing methods for searching a collection of cursive handwriting. The first, the word-level index, treats each word as a pictogram and uses global features to represent and retrieve cursive words. Each word (or stroke) is described by a set of features and can therefore be stored as a point in the feature space. The Karhunen-Loève transform is then used to reduce the number of features (the data dimensionality) and thus the index size. Feature vectors are stored in an R-tree. The second, the stroke-level index, treats the word as a set of strokes. We implemented both indexes and carried out many simulation experiments to measure the effectiveness and cost of the search algorithm. The proposed indexes achieve substantial savings in search time over sequential search. Moreover, they improve the matching rate by up to 46% over sequential search. The word-level index is suitable for large collections of cursive text. The stroke-level index is more accurate than the word-level index but more costly in terms of search time.
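The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each word is already summarized as a numeric feature vector (the abstract does not list the actual features), computes a Karhunen-Loève basis by power iteration on the covariance matrix, and projects vectors onto the leading components. All function names here are hypothetical, and the linear scan in `nearest` is only a stand-in for the R-tree lookup the paper describes.

```python
import math
import random

def mean_center(vectors):
    """Return (centered vectors, per-feature mean)."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    return [[v[j] - mean[j] for j in range(d)] for v in vectors], mean

def covariance(centered):
    """Covariance matrix of already-centered vectors."""
    n, d = len(centered), len(centered[0])
    return [[sum(v[i] * v[j] for v in centered) / n for j in range(d)]
            for i in range(d)]

def top_eigenvectors(cov, k, iters=500, seed=0):
    """Leading k eigenvectors of a symmetric matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy; deflation modifies it
    basis = []
    for _ in range(k):
        vec = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(iters):
            nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # no variance left in the deflated matrix
                break
            vec = [x / norm for x in nxt]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        # Rayleigh quotient gives the eigenvalue for this unit vector.
        eigval = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                     for i in range(d))
        basis.append(vec)
        # Deflate: remove the component just found.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigval * vec[i] * vec[j]
    return basis

def fit_kl(vectors, k):
    """Estimate the mean and the k leading Karhunen-Loève basis vectors."""
    centered, mean = mean_center(vectors)
    return mean, top_eigenvectors(covariance(centered), k)

def project(vector, mean, basis):
    """Map a full feature vector to its k-dimensional reduced form."""
    return [sum((vector[j] - mean[j]) * b[j] for j in range(len(mean)))
            for b in basis]

def nearest(query, points):
    """Index of the closest reduced point (linear scan, not an R-tree)."""
    return min(range(len(points)), key=lambda i: math.dist(query, points[i]))

if __name__ == "__main__":
    # Made-up 3-feature vectors standing in for word descriptors.
    features = [[float(t), 2.0 * t, 0.0] for t in range(5)]
    mean, basis = fit_kl(features, k=1)
    reduced = [project(v, mean, basis) for v in features]
    print(nearest(project([3.1, 6.0, 0.2], mean, basis), reduced))
```

In a real index the reduced vectors would be inserted into a spatial structure such as an R-tree, so that a query touches only a few nodes instead of every stored point.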

Similar resources

A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine

Background and Aim: The purpose of this study was to compare the impact of text based indexing and folksonomy in image retrieval via Google search engine. Methods: This study used experimental method. The sample is 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...

Indexing and retrieval of handwritten medical forms

POSTER PAPER. This paper proposes an approach of indexing and retrieving degraded handwritten documents. We present a modified version of the popular Vector Model in information retrieval (IR). Our model incorporates top n candidates from a HR system into the scheme of calculating the term frequency (tf) and the inverted document frequency (idf). Standardized IR Tests show that the proposed app...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Interactive Smoothing of Handwritten Text Images Using a Bilateral filter

The use of digital images of handwritten historical documents has become more popular in recent years. Volunteers around the world now read thousands of these images as part of their indexing process. Handwritten text images of old documents are sometimes difficult to read or noisy due to the preservation of the document and quality of the image. In this paper, we present a technique that allow...

Handwritten Text Recognition for Ancient Documents

Huge amounts of legacy documents are being published by on-line digital libraries world wide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems ...

Using a Hidden-Markov Model in Semi-Automatic Indexing of Historical Handwritten Records

Indexing of historical records is a process that uses human effort to read text images and convert them into a machine readable format that facilitates search. The Church of Jesus Christ of Latter-day Saints has been using volunteers to index millions of microfilm images of genealogy records collected throughout the world. This indexing process is time-consuming. We adapt a technique for holist...

Publication date: 2010